IAHLT

Resources

The Mastermind Behind GPT-4 and the Future of AI | Ilya Sutskever

In this podcast episode, Ilya Sutskever, co-founder and chief scientist at OpenAI, discusses his vision for the future of artificial intelligence (AI), including large language models like GPT-4. Sutskever starts by explaining the importance of AI research and how OpenAI is working to advance the field. He shares his views on the ethical considerations of AI development and the potential impact of AI on society. The conversation then moves on to large language models and their capabilities: Sutskever talks about the challenges of developing GPT-4, the limitations of current models, the potential for large language models to generate text that is indistinguishable from human writing, and how this technology could be used in the future. He also shares his views on AI-aided democracy and how AI could help solve global problems such as climate change and poverty, and he emphasises the importance of building AI systems that are transparent, ethical, and aligned with human values. Throughout the conversation, Sutskever provides insights into the current state of AI research, the challenges facing the field, and his vision for the future of AI.

Timestamps:
00:04 Introduction of Craig Smith and Ilya Sutskever.
01:00 Sutskever's AI and consciousness interests.
02:30 Sutskever's start in machine learning with Hinton.
03:45 Realization about training large neural networks.
06:33 Convolutional neural network breakthroughs and ImageNet.
08:36 Predicting the next thing for unsupervised learning.
10:24 Development of GPT-3 and scaling in deep learning.
11:42 Specific scaling in deep learning and potential discovery.
13:01 Small changes can have big impact.
13:46 Limits of large language models and lack of understanding.
14:32 Difficulty in discussing limits of language models.
15:13 Statistical regularities lead to better understanding of world.
16:33 Limitations of language models and hope for reinforcement learning.
17:52 Teaching neural nets through interaction with humans.
21:44 Multimodal understanding not necessary for language models.
25:28 Autoregressive transformers and high-dimensional distributions.
26:02 Autoregressive transformers work well on images.
27:09 Pixels represented like a string of text.
29:40 Large generative models learn compressed representations of real-world processes.
31:31 Human teachers needed to guide reinforcement learning process.
35:10 Opportunity to teach AI models more skills with less data.
39:57 Desirable to have democratic process for providing information.
41:15 Impossible to understand everything in complicated situations.
מבוא ללשון עברית: מערכות הכללים בשפה (Introduction to Hebrew: The Rule Systems of the Language)

More videos: http://www.naamaglobal.co.il (online Hebrew-language lessons, preparing for the bagrut matriculation exam). This is the introductory video in a series on the Hebrew language: a brief overview of the language's different rule systems before diving into Hebrew syntax and the morphological system. Every language has rule systems of several kinds. These systems are called phonology, semantics, pragmatics, morphology, and syntax, and each of them examines the rules of the language from a slightly different angle, studying a different aspect of the language.

Phonology (torat ha-hege in Hebrew) studies the sound system of a language. The phoneme inventory of a given language is the set of its consonants (the letters without vowel pointing) and its vowels (such as patach, chirik, kamatz, and so on). There are labial consonants such as ב, guttural ones such as ח, voiceless ones, sibilants, and more; vowels can be back, front, long, short, or reduced. Phonemes differ from language to language: the Hebrew word tapuach ('apple'), for example, contains the sound ח, which does not exist in other languages such as English.

Semantics studies the content conveyed through language, the meaning of words and expressions. Semantics distinguishes between the signifier and what is signified by it. The signifier is the word itself, which can be spoken as a sequence of sounds (tapuach) or written as a sequence of letters, while the signified is the meaning the word stands for, the content behind it. A word's content includes all the knowledge, thoughts, and feelings associated with it.

Pragmatics also studies the meaning of words and expressions, but in the context of conversation: it studies how we use language to communicate with other people and to achieve our goals. This rule system takes into account our conversation partner, the circumstances, the context, and social conventions. We can choose among several possible ways to deliver our message. If we want an apple from someone, for instance, we can use the imperative ("Bring me an apple right now!", "Give me an apple"), or we can ask or request ("May I have an apple?", "Could you please give me an apple?").

The study of a language's word-formation rules is called morphology (torat ha-tsurot in Hebrew), and it deals with the structure of words. The words of any language can be broken into simpler units of meaning. The smallest meaning-bearing unit is called a morpheme: the word tapuchim ('apples'), for example, can be split into two morphemes, tapuach + im, where the first denotes a particular fruit and the second denotes quantity ("more than one apple").

The last rule system we will discuss is syntax. Every language has rules that determine sentence structure. A sentence is a group of words arranged in a logical order that expresses a particular idea, and changing the word order can change the sentence's meaning, for example "the plate is on the apple" instead of "the apple is on the plate". The following videos introduce the rules of Hebrew syntax and of the Hebrew morphological system.
Steven Pinker: Linguistics as a Window to Understanding the Brain | Big Think

In this lecture, Steven Pinker, renowned linguist and Harvard psychology professor, discusses linguistics as a window to understanding the human brain. How is it that human beings have come to acquire language? Pinker's introduction to the field includes thoughts on the evolution of spoken language and the debate over the existence of an innate universal grammar, as well as an exploration of why language is such a fundamental part of social relationships, human biology, and human evolution. Finally, Pinker touches on the wide variety of applications for linguistics, from improving how we teach reading and writing to how we interpret law, politics, and literature.

Full transcript: https://bigthink.com/videos/how-we-speak-reveals-how-we-think-with-steven-pinker

Steven Pinker is an experimental psychologist who conducts research in visual cognition, psycholinguistics, and social relations. He grew up in Montreal and earned his BA from McGill and his PhD from Harvard. Currently Johnstone Professor of Psychology at Harvard, he has also taught at Stanford and MIT. He has won numerous prizes for his research, his teaching, and his nine books, including The Language Instinct, How the Mind Works, The Blank Slate, The Better Angels of Our Nature, The Sense of Style, and Enlightenment Now: The Case for Reason, Science, Humanism, and Progress.
GPT-3: Language Models are Few-Shot Learners (Paper Explained)

How far can you go with ONLY language modeling? Can a large enough language model perform NLP tasks out of the box? OpenAI takes on these and other questions by training a transformer that is an order of magnitude larger than anything built before, and the results are astounding.

OUTLINE:
0:00 - Intro & Overview
1:20 - Language Models
2:45 - Language Modeling Datasets
3:20 - Model Size
5:35 - Transformer Models
7:25 - Fine Tuning
10:15 - In-Context Learning
17:15 - Start of Experimental Results
19:10 - Question Answering
23:10 - What I think is happening
28:50 - Translation
31:30 - Winograd Schemas
33:00 - Commonsense Reasoning
37:00 - Reading Comprehension
37:30 - SuperGLUE
40:40 - NLI
41:40 - Arithmetic Expressions
48:30 - Word Unscrambling
50:30 - SAT Analogies
52:10 - News Article Generation
58:10 - Made-up Words
1:01:10 - Training Set Contamination
1:03:10 - Task Examples

Paper: https://arxiv.org/abs/2005.14165
Code: https://github.com/openai/gpt-3

Abstract: Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something which current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic. At the same time, we also identify some datasets where GPT-3's few-shot learning still struggles, as well as some datasets where GPT-3 faces methodological issues related to training on large web corpora. Finally, we find that GPT-3 can generate samples of news articles which human evaluators have difficulty distinguishing from articles written by humans. We discuss broader societal impacts of this finding and of GPT-3 in general.

Authors: Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei
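To make the paper's few-shot, in-context setup concrete, here is a minimal sketch of how a task is specified purely as text, with no gradient updates. The translation demonstrations echo the paper's own English-to-French example; the final call to a completion-style API is left as a comment, since any language-model endpoint could be substituted:

# Build a few-shot prompt: a task description, a handful of
# demonstrations, then the query the model must complete.
demonstrations = [
    ("sea otter", "loutre de mer"),
    ("peppermint", "menthe poivrée"),
    ("plush giraffe", "girafe en peluche"),
]

prompt = "Translate English to French:\n"
for english, french in demonstrations:
    prompt += f"{english} => {french}\n"
prompt += "cheese =>"  # the model's continuation is taken as its answer

print(prompt)
# prompt would now be sent to a completion-style API;
# no fine-tuning or gradient updates are involved.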
Arabic Influence on Modern Hebrew!!

This video is all about how the Arabic language has influenced Modern Hebrew!
The ARABIC Language (Its Amazing History and Features)

This video is all about the Arabic language, from its early origins on the Arabian Peninsula to its current status as the 5th most spoken language on Earth, along with an examination of a number of features of Arabic.

Video chapters:
00:00 Introduction
00:32 General Information about the Arabic Language
01:07 Varieties of Arabic
02:06 Arabic is a Semitic language
02:22 Old Arabic
03:51 Classical Arabic
05:04 Neo-Arabic & Middle Arabic
06:02 Modern Arabic
06:47 Diglossia in Arabic
08:21 The Arabic script
09:24 Arabic phonology
10:30 Morphology in the Arabic language
11:36 Verbs in Arabic
13:05 Word order in Arabic
14:00 Cases in Arabic
15:05 Sentence breakdown
16:30 Final comments
17:22 The Question of the Day
Natural Language Processing

Natural Language Processing is a field of Artificial Intelligence dedicated to enabling computers to understand and communicate in human language. NLP is only a few decades old, but we've made significant progress in that time. I'll cover how it's changed over the years, then show you how you can easily build an NLP app that can either classify or summarize text. This is incredibly powerful technology that anyone can freely use.

Code for this video: https://github.com/llSourcell/bert-as-service

More learning resources:
https://www.youtube.com/watch?v=0n95f-eqZdw
http://mlexplained.com/2019/01/30/an-in-depth-tutorial-to-allennlp-from-basics-to-elmo-and-bert/
https://towardsdatascience.com/beyond-word-embeddings-part-2-word-vectors-nlp-modeling-from-bow-to-bert-4ebd4711d0ec
https://gluon-nlp.mxnet.io/examples/sentence_embedding/bert.html
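As a rough sketch of the classification use case, the snippet below builds a tiny sentiment classifier on top of the bert-as-service client linked above. It assumes a BERT server is already running locally (started separately with bert-serving-start) and uses a toy labeled dataset, so treat it as an illustration rather than the video's exact code:

from bert_serving.client import BertClient
from sklearn.linear_model import LogisticRegression

# Connect to a bert-as-service server assumed to be running locally,
# e.g. started with: bert-serving-start -model_dir /path/to/bert -num_worker=1
bc = BertClient()

texts = [
    "great movie, loved every minute",
    "terrible plot, I fell asleep",
    "what a fantastic film",
    "worst two hours of my life",
]
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

# Encode each sentence into a fixed-size BERT embedding,
# then fit a simple classifier on top of the embeddings
features = bc.encode(texts)
clf = LogisticRegression().fit(features, labels)

print(clf.predict(bc.encode(["I really enjoyed this one"])))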

Consolidating and Exploring Open Textual Knowledge, Prof. Ido Dagan, Bar-Ilan University

Introduction to Language: Computational Processing of Human Language, with Prof. Ido Dagan (on Spotify)

Start with NLP

Recommended courses: 
https://www.coursera.org/specializations/natural-language-processing

 

Recommended textbook, available online: 
https://web.stanford.edu/~jurafsky/slp3/

 

It also provides great little introductions to many fields of linguistics before you hop into the computational part.

NLP Tutorials Part I: From Basics to Advanced

https://www.analyticsvidhya.com/blog/2022/01/nlp-tutorials-part-i-from-basics-to-advance/ 

 

Natural Language Processing with Python

https://www.nltk.org/book/

 

100 ChatGPT terms explained from NLP to Entity Extraction

https://www.geeky-gadgets.com/chatgpt-terms-explained/

Natural Language Processing In Healthcare

https://www.routledge.com/Natural-Language-Processing-In-Healthcare-A-Special-Focus-on-Low-Resource/Dash-Parida-Tello-Acharya-Bojar/p/book/9780367685393


Hebrew NLP Resources

https://github.com/NNLP-IL/Resources


NNLP-IL Hebrew and Arabic NLP Resources

https://resources.nnlp-il.mafat.ai

Hebrew Handwritten Text Recognizer (OCR)

https://github.com/Lotemn102/HebHTR

 

Datasets and possible collaborations (Hebrew, Google Sheets):
https://docs.google.com/spreadsheets/d/1fGYKyA5Jf_KPCXPCpRWGfRzjDc6ALp9dgKnbIXqxM_Y/edit#gid=0

 

Legal opinion: Uses of copyright-protected content for machine learning (Hebrew)

https://www.gov.il/he/departments/legalInfo/machine-learning

Israel's Policy on Artificial Intelligence Regulation and Ethics (Ministry of Innovation, Science and Technology)

https://www.gov.il/en/departments/policies/ai_2023

 
Open Source

GitHub

NLP
https://github.com/topics/natural-language-processing

 

Speech

https://github.com/topics/speech

spaCy · Industrial-strength Natural Language Processing in Python
https://spacy.io/
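A minimal example of what spaCy looks like in practice, assuming the small English pipeline has been installed beforehand with "python -m spacy download en_core_web_sm":

import spacy

# Load the small English pipeline (installed as noted above)
nlp = spacy.load("en_core_web_sm")

doc = nlp("Apple is looking at buying a U.K. startup for $1 billion.")

# Tokenization, part-of-speech tags, and dependency labels
for token in doc:
    print(token.text, token.pos_, token.dep_)

# Named entities recognized in the sentence
for ent in doc.ents:
    print(ent.text, ent.label_)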

Stanza – A Python NLP Package for Many Human Languages

Created by the Stanford NLP Group

https://stanfordnlp.github.io/stanza/
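A short Stanza sketch for Hebrew; the first run downloads the Hebrew model, and the tags and lemmas come from the default pipeline:

import stanza

# First run downloads the Hebrew model
stanza.download("he")
nlp = stanza.Pipeline("he")

doc = nlp("התפוח נמצא על הצלחת")
for sentence in doc.sentences:
    for word in sentence.words:
        # Surface form, universal POS tag, and lemma
        print(word.text, word.upos, word.lemma)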

Open Source OCR

https://github.com/tesseract-ocr/tesseract
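Tesseract itself is a command-line tool; a common way to call it from Python is the pytesseract wrapper, sketched below under the assumption that the Tesseract binary and the Hebrew traineddata are installed (the file name is a placeholder):

import pytesseract
from PIL import Image

# Requires the Tesseract binary on PATH and the Hebrew
# traineddata ("heb") installed alongside it
text = pytesseract.image_to_string(Image.open("scanned_page.png"), lang="heb")
print(text)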

Speech Recognition - Whisper (OpenAI)

https://cdn.openai.com/papers/whisper.pdf
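The open-source whisper package exposes a very small API; a minimal transcription sketch (the audio file name is a placeholder):

import whisper  # pip install openai-whisper

# "base" is a small multilingual checkpoint; Whisper detects the
# spoken language automatically unless one is specified
model = whisper.load_model("base")
result = model.transcribe("interview.mp3")
print(result["text"])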

Unsupervised

Large language models (LLMs)

Open LLMs List
https://github.com/eugeneyan/open-llms

AWS Bedrock

https://aws.amazon.com/bedrock/

Google Gemma

https://ai.google.dev/gemma

Google Gemini

https://deepmind.google/technologies/gemini/#introduction

Meta Llama 2
https://ai.meta.com/llama/



What’s before GPT-4? A deep dive into ChatGPT

https://medium.com/digital-sense-ai/whats-before-gpt-4-a-deep-dive-into-chatgpt-dfce9db49956

GPT-4 Training process

Like previous GPT models, the GPT-4 base model was trained to predict the next word in a document, and was trained using publicly available data (such as internet data) as well as data we’ve licensed. The data is a web-scale corpus of data including correct and incorrect solutions to math problems, weak and strong reasoning, self-contradictory and consistent statements, and representing a great variety of ideologies and ideas.

So when prompted with a question, the base model can respond in a wide variety of ways that might be far from a user’s intent. To align it with the user’s intent within guardrails, we fine-tune the model’s behavior using reinforcement learning with human feedback (RLHF).

Note that the model’s capabilities seem to come primarily from the pre-training process—RLHF does not improve exam performance (without active effort, it actually degrades it). But steering of the model comes from the post-training process—the base model requires prompt engineering to even know that it should answer the questions.

https://openai.com/research/gpt-4
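To make the "predict the next word" objective concrete, here is a toy, illustrative sketch of the pre-training loss. This is not OpenAI's code: the model is a trivial embedding-plus-linear stand-in (real models put a deep transformer in between) and the data is random, but the shifted-target cross-entropy is the same objective:

import torch
import torch.nn as nn

# Toy "language model": an embedding table plus a linear head
# over a tiny vocabulary
vocab_size, dim = 100, 32
embed = nn.Embedding(vocab_size, dim)
head = nn.Linear(dim, vocab_size)

tokens = torch.randint(0, vocab_size, (1, 16))  # one random 16-token sequence
hidden = embed(tokens[:, :-1])   # inputs are positions 0..14
logits = head(hidden)            # predictions for positions 1..15
targets = tokens[:, 1:]          # each target is the *next* token

# Cross-entropy between the predicted distribution and the actual
# next token: the "predict the next word" objective
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), targets.reshape(-1)
)
print(loss.item())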

BERT
https://github.com/google-research/bert
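BERT is trained with a masked-token objective rather than left-to-right prediction. A quick way to try it is the Hugging Face transformers pipeline (an assumption here, since the link above is the original TensorFlow repo):

from transformers import pipeline

# Masked-token prediction with the original English BERT checkpoint
fill = pipeline("fill-mask", model="bert-base-uncased")

for pred in fill("Natural language processing lets computers [MASK] human language."):
    print(pred["token_str"], round(pred["score"], 3))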

AlephBERT

https://github.com/OnlpLab/AlephBERT
https://arxiv.org/pdf/2104.04052.pdf

Multi-language Aspects
How Language-Neutral is Multilingual BERT?

https://arxiv.org/pdf/1911.03310.pdf

AraBERT: Transformer-based Model for Arabic Language Understanding
https://arxiv.org/pdf/2003.00104.pdf
 
ELMo

https://allennlp.org/elmo

 

LaBSE - Language-agnostic BERT sentence embedding model supporting 109 languages.

https://tfhub.dev/google/LaBSE/2

A port of the LaBSE model to PyTorch; it can be used to map 109 languages to a shared vector space.
https://huggingface.co/sentence-transformers/LaBSE
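A minimal usage sketch via the sentence-transformers package: encode an English and a Hebrew sentence and compare them in the shared space (the sentence pair is an arbitrary example):

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/LaBSE")

# An English sentence and a Hebrew sentence with the same meaning
sentences = ["The apple is on the plate.", "התפוח על הצלחת"]
embeddings = model.encode(sentences)

# Cosine similarity in the shared vector space; translations
# should score much higher than unrelated sentence pairs
print(util.cos_sim(embeddings[0], embeddings[1]))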

Claude is a large language model (LLM) built by Anthropic. It's trained to be a helpful assistant in a conversational tone.

https://docs.anthropic.com/claude/docs/getting-started-with-claude
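A minimal sketch of calling Claude through Anthropic's Python SDK; the model name below is an assumption and may need updating against the docs linked above, and an API key is expected in the environment:

import anthropic  # pip install anthropic

# Reads the API key from the ANTHROPIC_API_KEY environment variable
client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-3-haiku-20240307",  # model name is an assumption; check the docs
    max_tokens=256,
    messages=[{"role": "user", "content": "Explain NLP in two sentences."}],
)
print(message.content[0].text)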

 

Jais - open-source Arabic Large Language Model (LLM)

https://huggingface.co/core42/jais-13b/tree/main

https://inceptioniai.org/jais/

Falcon LLM: Falcon 40B was the world's top-ranked open-source AI model when launched

https://falconllm.tii.ae/falcon.html

 

Mistral AI brings strong open generative models to developers, along with efficient ways to deploy and customise them for production.

https://mistral.ai

Mamba: Linear-Time Sequence Modeling with Selective State Spaces
https://huggingface.co/papers/2312.00752

https://www.together.ai/blog/mamba-3b-slimpj
