How do Large Language Models compare to NLP toolkits for NLP tasks?
I need to do some NLP on text in a number of different languages (English, Spanish, Russian etc). I've experimented using spaCy, stanza and NLTK, as well as some LLMs like ChatGPT, Bard, LLaMa 2 and GPT-4, to do things like lemmatization and POS tagging.
In my experimentation, GPT-4 with adequate prompting outperformed everything else in every language. I wasn't able to spot any errors.
The other LLMs were roughly on par with the NLP toolkits: they were a bit more robust to imperfections in the input strings (typos, odd punctuation, etc.), but they were also more likely to make very simple mistakes.
Have you guys tried to use LLMs for NLP?
Can you confirm my experimental results, or did you get a different outcome?
Is anyone trying to take advantage of the power of LLMs for these tasks? For instance, is anyone trying to extract NLP features from the insides of models like LLaMa 2?
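One way to make a comparison like the one described above easier to confirm is to score each tool against a small hand-annotated sample instead of eyeballing its output. The sketch below assumes every tool's output has already been converted to (token, lemma, POS) triples with the same tokenization as the gold sample. The function name, the sample sentence and the "tool output" are all made up for illustration.

```python
from typing import List, Tuple

Analysis = Tuple[str, str, str]  # (token, lemma, POS tag)


def score(gold: List[Analysis], predicted: List[Analysis]) -> dict:
    """Return lemma and POS accuracy, assuming identical tokenization."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same number of tokens")
    if not gold:
        raise ValueError("cannot score an empty sample")
    lemma_hits = sum(g[1] == p[1] for g, p in zip(gold, predicted))
    pos_hits = sum(g[2] == p[2] for g, p in zip(gold, predicted))
    return {
        "lemma_accuracy": lemma_hits / len(gold),
        "pos_accuracy": pos_hits / len(gold),
    }


# Hypothetical hand-annotated sample (Universal Dependencies style POS tags).
gold = [("The", "the", "DET"), ("cats", "cat", "NOUN"),
        ("were", "be", "AUX"), ("sleeping", "sleep", "VERB")]

# Hypothetical output of one tool; it gets one lemma wrong ("were" -> "were").
tool_output = [("The", "the", "DET"), ("cats", "cat", "NOUN"),
               ("were", "were", "AUX"), ("sleeping", "sleep", "VERB")]

result = score(gold, tool_output)
print(result)
```

Running the same gold sample through each toolkit and each LLM gives numbers that other people can compare directly, per language.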
/r/LanguageTechnology
https://redd.it/16gtrk4
[R] Unveiling theory of mind in large language models: A parallel to single neurons in the human brain - Harvard University 2023
Paper: https://arxiv.org/abs/2309.01660
Abstract:
>With their recent development, large language models (LLMs) have been found to exhibit a certain level of Theory of Mind (ToM), a complex cognitive capacity that is related to our conscious mind and that allows us to infer another's beliefs and perspective. While human ToM capabilities are believed to derive from the neural activity of a broadly interconnected brain network, including that of dorsal medial prefrontal cortex (dmPFC) neurons, the precise processes underlying LLM's capacity for ToM or their similarities with that of humans remains largely unknown. In this study, we drew inspiration from the dmPFC neurons subserving human ToM and employed a similar methodology to examine whether LLMs exhibit comparable characteristics. Surprisingly, our analysis revealed a striking resemblance between the two, as hidden embeddings (artificial neurons) within LLMs started to exhibit significant responsiveness to either true- or false-belief trials, suggesting their ability to represent another's perspective. These artificial embedding responses were closely correlated with the LLMs' performance during the ToM tasks, a property that was dependent on the size of the models. Further, the other's beliefs could be accurately decoded using the entire embeddings, indicating the presence of the embeddings' ToM capability at the population level. Together, our findings revealed an emergent property of LLMs' embeddings that modified their activities in response to ToM features, offering initial evidence of a parallel between the artificial model and neurons in the human brain.
https://preview.redd.it/2wduugp4svnb1.png?width=1098&format=png&auto=webp&s=d59878eec6a6570a15ac2a3f9d3485a3c140eb73
https://preview.redd.it/qkobarp4svnb1.png?width=1094&format=png&auto=webp&s=08c17207e282effc21149984e88e143f0878c154
https://preview.redd.it/qz9zydp4svnb1.png?width=1116&format=png&auto=webp&s=a08f4257235a60597ec9a85be3cd6c7df409d755
https://preview.redd.it/c0v4qmp4svnb1.png?width=1143&format=png&auto=webp&s=62c238c1bde2bce7e56de5e738ad2abce71d042d
/r/MachineLearning
https://redd.it/16h1tup
I've created a neural network library in C++ and trained an image super-resolution model with it; the results are surprisingly good.
Hey.
To cut a long story short, I've written a library in C++ from scratch, using only the Eigen library (and still writing most algorithms by hand because of Eigen's terrible performance). Anyway, I've been experimenting with image super-resolution for the past 2 weeks, and I finally found the right recipe for a reasonably performing image upscaler.
I'm using a really small network with only 5 convolutional layers with small kernel sizes (5 and 3) and a pixel shuffle layer at the end. The network is trained to correct the error of bicubic interpolation rather than to upscale the image directly, and that's probably why it performs so 'well', but you can be the judge of that...
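To make the two ideas in that description concrete, here is a minimal pure-Python sketch of a pixel shuffle followed by a residual add. It is not the library's C++ code: the channel layout is the one commonly used for pixel shuffle, the function names are invented, and a constant 2x2 image stands in for the bicubic upscale.

```python
from typing import List

Plane = List[List[float]]  # one channel, indexed [row][col]


def pixel_shuffle(channels: List[Plane], r: int) -> List[Plane]:
    """Rearrange C*r*r low-res channels into C channels that are r times larger.

    Uses the common layout out[c][h*r + i][w*r + j] = in[c*r*r + i*r + j][h][w].
    """
    if len(channels) % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    out_c = len(channels) // (r * r)
    h, w = len(channels[0]), len(channels[0][0])
    out = [[[0.0] * (w * r) for _ in range(h * r)] for _ in range(out_c)]
    for c in range(out_c):
        for i in range(r):
            for j in range(r):
                src = channels[c * r * r + i * r + j]
                for y in range(h):
                    for x in range(w):
                        out[c][y * r + i][x * r + j] = src[y][x]
    return out


def add_residual(baseline: Plane, residual: Plane) -> Plane:
    """Final image = interpolated baseline + predicted correction."""
    return [[b + d for b, d in zip(brow, drow)]
            for brow, drow in zip(baseline, residual)]


# Hypothetical 1x1 network output with 4 channels (r = 2, one output channel).
net_output = [[[0.1]], [[0.2]], [[0.3]], [[0.4]]]
correction = pixel_shuffle(net_output, r=2)[0]  # 2x2 correction map

# Hypothetical 2x2 baseline, standing in for the bicubic upscale.
baseline = [[1.0, 1.0], [1.0, 1.0]]
upscaled = add_residual(baseline, correction)
print(upscaled)
```

Because the network only has to predict the difference from the interpolated image, its output stays close to zero in flat regions and is concentrated around edges.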
Here is an example of an image upscaled by the network:
2x Image upscaling
And of course my upscaled pup:
https://preview.redd.it/8frhiak2dwlb1.png?width=1918&format=png&auto=webp&s=05d647b176764dc34350fa9fa9db5b0d71bc38ab
The network mostly just reconstructs the edges in the image, but doesn't really 'hallucinate' any new detail, so the results are quite pleasing. (Still outperforms FSR1 by a lot from my testing). And it should be able to run in real-time on GPU if it were to be ported...
And here is a link to the tool: https://github.com/Panjaksli/BNN/tree/v1.0a
You can try it out, and tell me what you think. Thanks.
/r/deeplearning
https://redd.it/168b0p7
My master's research has beaten state-of-the-art [R]. I am not sure what to do about it [D].
Hello,
My research (Dissertation for MSc in AI) on applying LLMs to drug binding affinity prediction has beaten previous state-of-the-art in single sequence prediction tasks.
My method yields a correlation of 0.7079 for SMILES and 0.7007 for AA-pockets, which improves upon the previous state-of-the-art correlations of 0.485 and 0.501, respectively. The prior state-of-the-art is described and documented in the paper: "Improved Protein−Ligand Binding Affinity Prediction with Structure-Based Deep Fusion Inference" -> https://pubs.acs.org/doi/10.1021/acs.jcim.0c01306.
However, I don't really know what to do with this information. I did not have a supervisor who led me through this research and with whom I can discuss it. This is because my supervisor works in a different field from ML (my university assigns supervisors semi-randomly, and I was given someone who focuses on string algorithms), so we agreed that I would go down my own path (for the past year), as I really wanted to undertake LLM research. Unfortunately, that means no one I know is knowledgeable in the field. My work is currently being marked (I submitted it 2 weeks ago) and I won't get any feedback until November.
My only idea so far is to put it on arXiv. As an MSc student I'm still pretty new to research, so any advice on what I should do next would be useful.
The GitHub repo for my work (still a bit of a WIP) can be found here: https://github.com/jacobmcasey/large-language-models-for-protein-ligand-binding/tree/main
/r/MachineLearning
https://redd.it/169mdnf
Introducing Code Llama, a state-of-the-art large language model for coding
https://ai.meta.com/blog/code-llama-large-language-model-coding/
/r/deeplearning
https://redd.it/1605opp
Fast CV App: Cross Platform Computer Vision Using Multiprocessing
**Why is this relevant to computer vision?**
In my project I show that a pure python app that does 1080p 30fps on both Windows and Mac is possible. It's good for prototyping, for testing (especially if you can just go to a C variant and make it really fast) and I hope in the future, for making "serious" apps.
I'm sharing this because I have never seen anybody talk about using multiprocessing, data compression, and a pure python GUI packaged to windows/mac in the context of computer vision. This might be due to people on reddit/discord/stack exchange just not talking about it but I really do think that this information is just locked to the industry professionals.
This is probably because people don't need it if they have a team of people working on a qt frontend and have another team working on computer vision specifically.
I haven't seen anybody working on this information publicly. All the good stuff is closed source in big corporations:
* Example: MediaPipe's Slack channel requires a Google email: https://github.com/google/mediapipe/issues/779#issuecomment-1101212500
* I DEFINITELY do not have access to instagram filters or very specifically how they apply their filter processing. What I do know is that their more complex filters are not 30 fps at all on mobile phones.
* I can't recall off the top of my head other industry standard pose estimation apps that have open source code/documentation...
**What is my project?**
Here I show with Fast CV App that it is possible and that there is room for improvement. For example, I could "blit buffer" to a shared datatype instead of uploading the whole frame to shared memory, or even convert to YUV so that blit buffer on the kivy frontend is even faster, etc etc.
**How it works**
I gave up on threading because I just could not get MediaPipe threading on 1080p frames to hit 30fps. According to the MediaPipe docs, it actually drops frames to maintain framerate. I go one step further and actually analyze each frame. I do that by cheating: I read future frames using OpenCV/ffmpeg, send them to a multiprocessing subprocess for analysis, then receive the frames in Kivy and display them at the right time. This is where data compression kicks in, because inter-process communication was hell on this pipeline, taking up ~20-30ms, which basically negated the benefits of multiprocessing. This delay meant that instead of 3-4 subprocesses being sufficient, you needed to run ~6-8 subprocesses, which is just not OK. I was stumped on this problem for ~3 months until I realized I could use a compression library like blosc to shrink the 1080p frames I was sending and receiving from 6MB to 3.8MB, so that IPC took ~5ms instead of ~20-30ms. In hindsight, I think this step is actually a basic solution and probably an industry standard, but none of the multiprocessing tutorials talked about compression, so I never thought about it.
A couple tricks/hints:
* Wrapping subprocess code in try/except blocks that call print(<error message here>, flush=True) was pretty good at catching silent errors from multiprocessing subprocesses.
* Start your multiprocessing code AFTER an `if __name__ == "__main__":` check or a similar guard so that you don't spawn subprocesses infinitely.
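The pipeline and the two tips above can be sketched roughly as follows. This is not Fast CV App's code: `zlib` from the standard library stands in for blosc so the example has no dependencies, the "frame" is a synthetic byte buffer, and `pack`, `unpack`, `analyze` and `make_frame` are names invented for this sketch.

```python
import multiprocessing as mp
import pickle
import zlib


def pack(obj) -> bytes:
    """Serialize and compress an object before sending it to another process."""
    raw = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    return zlib.compress(raw, level=1)  # low level: favor speed over ratio


def unpack(blob: bytes):
    """Inverse of pack()."""
    return pickle.loads(zlib.decompress(blob))


def analyze(blob: bytes) -> bytes:
    """Worker: decompress a frame, run a stand-in analysis, compress the result."""
    try:
        frame = unpack(blob)
        pixels = frame["pixels"]
        result = {"index": frame["index"], "mean": sum(pixels) / len(pixels)}
        return pack(result)
    except Exception as exc:
        # flush=True so the message is not lost in the subprocess's buffer.
        print(f"worker failed: {exc!r}", flush=True)
        raise


def make_frame(index: int) -> dict:
    """Synthetic stand-in for a 1080p 3-channel frame (far more compressible
    than real video, so the size reduction here is not representative)."""
    return {"index": index, "pixels": bytes([index % 256]) * (1920 * 1080 * 3)}


if __name__ == "__main__":
    # The guard keeps child processes from re-running this block on import.
    frames = [pack(make_frame(i)) for i in range(4)]
    with mp.Pool(processes=2) as pool:
        results = [unpack(r) for r in pool.map(analyze, frames)]
    print(results)
```

Only compressed bytes cross the process boundary in both directions, which is the part that cut the IPC cost in the post. How much you save on real frames depends on the content and the compressor.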
**Fast CV App links**
Github link:
https://github.com/AccelQuasarDragon/FastCVApp
Multiprocessing/Threading Analysis Video:
https://youtu.be/7-UdBUSfafo
Getting Started:
https://youtu.be/YnhHaKEx7pY
Thanks for your time and have a great day, hope this helps even one person out. Good luck!
/r/computervision
https://redd.it/15wdp3o
OpenAI Notebooks which are really helpful.
The OpenAI cookbook is one of the most underrated and underused developer resources available today. Here are 7 notebooks you should know about:
1. Improve LLM reliability:
https://github.com/openai/openai-cookbook/blob/main/techniques_to_improve_reliability.md
2. Embedding long text inputs:
https://github.com/openai/openai-cookbook/blob/main/examples/Embedding_long_inputs.ipynb
3. Dynamic masks with DALL-E:
https://github.com/openai/openai-cookbook/blob/main/examples/dalle/How_to_create_dynamic_masks_with_DALL-E_and_Segment_Anything.ipynb
4. Function calling to find places nearby:
https://github.com/openai/openai-cookbook/blob/main/examples/Function_calling_finding_nearby_places.ipynb
5. Visualize embeddings in 3D:
https://github.com/openai/openai-cookbook/blob/main/examples/Visualizing_embeddings_in_3D.ipynb
6. Pre- and post-processing of Whisper transcripts:
https://github.com/openai/openai-cookbook/blob/main/examples/Whisper_processing_guide.ipynb
7. Search, Retrieval, and Chat:
https://github.com/openai/openai-cookbook/blob/main/examples/Question_answering_using_a_search_API.ipynb
Big thanks to the creators of these notebooks!
/r/deeplearning
https://redd.it/15rihgo
[D] How to stay on the cutting edge of applied ML/AI while doing my PhD?
A lot of my PhD work will be in using different types of ML/NN approaches to characterizing problems in my field. It's kind of weird, since for my undergrad I came from a more traditional science background where we research off papers that were written like 2-20 years ago. Since a lot of these architectures and whatever are updating so fast, I wanted to see if there's a good way to keep up with the latest information so my work wouldn't be outdated by the time I publish. Is there a general workflow that those of you in the field follow in regards to this?
/r/MachineLearning
https://redd.it/15lnt4g
resources to learn about training LLMs?
I'd like to train a mini-LLM on a CPU just to get some experience with LLM training. Do y'all have any resources/links to relevant tutorials? I've looked around myself, but I couldn't find too many in-depth tutorials. I'm also interested in building my own toy LLM from scratch, just for better understanding.
/r/deeplearning
https://redd.it/15j3ls5
[D] NeurIPS 2023 Paper Reviews
NeurIPS 2023 paper reviews are visible on OpenReview. See this tweet. I thought I'd create a thread for us to discuss any issues, complaints, celebrations, or anything else.
There is so much noise in the reviews every year. Some good work that the authors are proud of might get a low score because of the noisy system, given that NeurIPS is growing so large these years. We should keep in mind that the work is still valuable no matter what the score is.
/r/MachineLearning
https://redd.it/15fo7td
Attention Is Off By One
https://www.evanmiller.org/attention-is-off-by-one.html
/r/deeplearning
https://redd.it/158xmbw
YoloV8 Body Pose Estimation TensorRT C++ Tutorial (link in comments)
/r/computervision
https://redd.it/156v3e5
Why use ONNX with Triton Inference Server? Why use ONNX in general?
Since Triton supports TensorFlow directly, and PyTorch via TorchScript, I was wondering why you would want to convert your model to ONNX. Is it simply to use TensorRT?
Also just wanted to know why use ONNX in general? What are the main advantages?
/r/computervision
https://redd.it/16ogz45
iPhone 15 Stereo Imaging
In yesterday’s keynote event, Apple released the iPhone 15 Pro Max. Apparently you can now take 3D images (only available on the iPhone 15 Pro). It uses two of its camera lenses to take two images from slightly different angles and performs stereo imaging to obtain depth.
So I’m sitting here thinking - every iPhone can do that - right? I’m looking at my iPhone 11 Pro Max and thinking about writing an iOS program that uses two lenses to take a “3D image.”
Sounds like a doable project right? I did stereo imaging and depth estimation projects for one of my classes so I think I can take on the challenge.
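For reference, the core relation behind depth from a stereo pair, as described above, is Z = f * B / d for a rectified pair under the pinhole model: depth is focal length (in pixels) times baseline, divided by disparity. The sketch below only evaluates that formula. Every number in it is made up and none of them is an iPhone specification, and a real two-lens setup would first need calibration and rectification because the lenses differ.

```python
def depth_from_disparity(focal_length_px: float, baseline_m: float,
                         disparity_px: float) -> float:
    """Depth Z = f * B / d for a rectified stereo pair (pinhole model)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# All values below are hypothetical, chosen only to illustrate the formula.
focal_length_px = 1500.0   # focal length expressed in pixels
baseline_m = 0.012         # distance between the two lens centers, in meters

for disparity_px in (30.0, 15.0, 6.0):
    z = depth_from_disparity(focal_length_px, baseline_m, disparity_px)
    print(f"disparity {disparity_px:5.1f} px -> depth {z:.2f} m")
```

With a baseline as small as the spacing between phone lenses, disparity shrinks quickly with distance, so depth estimates for far objects become very sensitive to matching errors.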
/r/computervision
https://redd.it/16ihrtk
[D] The ML Papers That Rocked Our World (2020-2023)
Hey everyone! 👋
I’ve been on a bit of a deep-dive lately, trying to catch up on all the awesome stuff that’s been happening in the ML space. It got me wondering, from 2020 to 2023, what have been the absolute must-read papers that shook the foundations and got everyone talking?
Whether it’s something that reinvented the wheel in your specific niche or just made waves industry-wide, I wanna hear about it!
I’m curious to see how different the responses will be, and hey, this might even become a go-to list for anyone looking to get the lowdown on the hottest trends and discoveries of the past few years.
Can’t wait to hear your thoughts!
# tl;dr
I decided to aggregate your best suggestions into categories for anyone interested in reading them without searching through the whole comment section in the future.
## Theoretical:
[Neural Networks are Decision Trees](https://arxiv.org/abs/2210.05189)
Cross-Validation Bias due to Unsupervised Preprocessing
[The Forward-Forward Algorithm: Some Preliminary Investigations](https://arxiv.org/abs/2212.13345)
LoRA: Low-Rank Adaptation of Large Language Models (included here as it has applications beyond LLMs)
[Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets](https://arxiv.org/abs/2201.02177)
## Image:
ViT related:
[An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (ViT)](https://arxiv.org/abs/2010.11929)
Emerging Properties in Self-Supervised Vision Transformers
[Training data-efficient image transformers & distillation through attention](https://arxiv.org/abs/2012.12877v2)
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
[A ConvNet for the 2020s (a CNN that implements several key components that contribute to the performance of Vision Transformers)](https://arxiv.org/abs/2201.03545)
(CLIP) Learning Transferable Visual Models From Natural Language Supervision
Diffusion related:
High-Resolution Image Synthesis with Latent Diffusion Models
[Denoising Diffusion Probabilistic Models (DDPM)](https://arxiv.org/abs/2006.11239)
Classifier-Free Diffusion Guidance
[Taming Transformers for High-Resolution Image Synthesis (VQGAN)](https://arxiv.org/abs/2012.09841)
Segment Anything (SAM)
[DINOv2: Learning Robust Visual Features without Supervision](https://arxiv.org/abs/2304.07193)
Bayesian Flow Networks
## NLP:
[Language Models are Few-Shot Learners (GPT-3)](https://arxiv.org/abs/2005.14165)
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
[Training language models to follow instructions with human feedback](https://arxiv.org/abs/2203.02155)
Training Compute-Optimal Large Language Models (Chinchilla)
[The Flan Collection: Designing Data and Methods for Effective Instruction Tuning](https://arxiv.org/abs/2301.13688)
LLaMA: Open and Efficient Foundation Language Models
[Toolformer: Language Models Can Teach Themselves to Use Tools](https://arxiv.org/abs/2302.04761)
## 3D Rendering:
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
[Highly accurate protein structure prediction with AlphaFold](https://www.nature.com/articles/s41586-021-03819-2)
## Misc:
Human-level play in the game of Diplomacy by combining language models with strategic reasoning
For a well-made and maintained list of ML resources (not only the newest like here) you can check out
How do these nickname tools work?
Hey everyone! I recently came across this interesting nickname generator (it is not the only one). It gave me a surprisingly accurate "japanese viking" name, which piqued my curiosity. From a linguistic perspective, how might such a tool understand and combine linguistic elements to produce coherent and culturally relevant nicknames? Does it consider phonetics, morphology, or other linguistic rules? Would love to get your insights!
/r/LanguageTechnology
https://redd.it/16d781w
Coding LLaMA 2 from scratch in PyTorch, with step by step explanation of KV Cache, Grouped Query Attention, Rotary Positional Embedding, RMS Normalization, SwiGLU and much more!
https://www.youtube.com/watch?v=oM4VmoabDAI
/r/deeplearning
https://redd.it/168onwq
Do you really need strong math (and ML) knowledge to be an NLP engineer?
Let me explain a bit. I come from a humanities bachelor's degree background, but with a strong passion for linguistics. I wanted to specialize in computational linguistics, but gradually I also became very interested in NLP and jobs related to NLP. That being said, I hope the repressed computer engineers don't show up now lol
I'm about to start a master's degree called "Digital Humanities", which is actually only about language technologies. The program includes subjects like NLP, computational linguistics, data mining, programming, data analysis, etc. I know that the Machine Learning (ML) course is fundamental for NLP, but the university's ML course requires strong math foundations and is designed for people with a bachelor's degree in computer science or computer engineering. So I had thought about skipping it and instead taking the course called "Computational Intelligence and Deep Learning", which focuses more on topics like fuzzy logic and especially artificial neural networks, RNNs, etc., without requiring prior math foundations.
And maybe adding also an Algorithms class (a good class but not too advanced) to have an additional foundation for NLP.
And then I might study ML on my own through private courses like the one from Stanford on platforms like Coursera.
Or would it be better for me to study the math part (linear algebra, integral and differential calculus, functions) and attempt the ML exam? Keep in mind that I've already taken a statistics course and enjoyed it, but honestly, I don't have that much motivation to study math extensively, especially because I might invest so much effort for nothing, since I might only find jobs like data linguist or computational linguist (given my background in humanistic informatics), where such strong math and ML knowledge is not necessary.
My career goal in NLP certainly isn't to research new algorithms and statistical models; I want to make more use of my linguistics knowledge in NLP, but not only to do annotation.
I've noticed there are many people working as "NLP engineers" who directly apply existing algorithms: many practical NLP tasks can be accomplished using existing libraries and tools without delving deep into the underlying mathematical concepts. So obviously you need to know algorithms and deep learning, but you don't need to go too deep into math research, right?
Or would it be better for me to just give up and focus solely on computational linguistics?
/r/LanguageTechnology
https://redd.it/165epjv
Getting data from physical circular chart.
/r/computervision
https://redd.it/162xdyo
Is CV evolving beyond bounding boxes?
Hi all - We (team of Stanford researchers) wrote a new blogpost on "Video Analysis Beyond Bounding Boxes" collecting some of our thoughts on the direction the CV field is heading.
We're actively researching and developing in this space, so we would love to hear some feedback on this vision for the future of CV and video analysis.
/r/computervision
https://redd.it/15ydds0
Your Neural Network Doesn't Know What It Doesn't Know
Hi everyone,
I made a repo trying to collect every high-quality source for Out-of-distribution detection, ranging from articles and talks for beginners to research papers at top conferences. It also has a primer if you are not familiar with the topic. Check it out and give it a star to support me if you find it helpful. Thanks a lot ;)
https://github.com/continuousml
https://preview.redd.it/3dsy0ameoxhb1.png?width=868&format=png&auto=webp&s=4a0c016ab9ad6baeb603bedac1d798572fc41152
/r/computervision
https://redd.it/15q8mx0
Looking for good learning sources around generative AI, specifically LLM
Are there any good video content sources that explain all the concepts associated with generative AI (e.g. RL, RLHF, transformers, etc.) from the ground up in extremely simple language (using analogies/stories about things that would be familiar to, say, a 10-12 year old)? I would also prefer channels that explain the concepts sequentially (so that they are easy to follow) and keep their videos short and crisp.
If yes, could you kindly comment below with suggestions? If not, could you comment on whether something like that would be useful to you, and ideally why?
Big thanks in advance 🙏
/r/deeplearning
https://redd.it/15hdu5v
[D] Simple Questions Thread
Please post your questions here instead of creating a new thread. Encourage others who create new posts for questions to post here instead!
Thread will stay alive until next one so keep posting after the date in the title.
Thanks to everyone for answering questions in the previous thread!
/r/MachineLearning
https://redd.it/15dnok8
Promptify 2.0: More Structured, More Powerful LLMs with Prompt-Optimization, Prompt-Engineering, and Structured Json Parsing with GPT-n Models! 🚀
Hello fellow coders and AI enthusiasts! First up, a huge thank you for making Promptify a hit with over **2.3k stars on GitHub**! 🌟 Back in 2022, we were the first to tackle the common challenge of uncontrolled, unstructured outputs from large language models like GPT-3, and your support has pushed us to keep improving. Today, we're thrilled to share some major updates that make Promptify even more powerful:
* **Unified Architecture 🧭**: Introducing Prompter, Model & Pipeline Solution
* **Detailed Output Logs 📔**: Comprehensive structured JSON format output within the log folder.
* **Wider Model Support 🤝:** Supporting models from OpenAI, Azure, Cohere, Anthropic, Huggingface and more - think of it as your universal language model adapter.
* **Robust Parser 🦸‍♂️**: Parser to handle incomplete or unstructured JSON outputs from any LLM.
* **Ready-Made Jinja Templates 📝:** Jinja prompt templates for NER, Text Classification, QA, Relation-Extraction, Tabular data, etc.
* **Database Integration 🔗**: Direct Promptify-to-MongoDB integration is coming soon. Stay tuned!
* **Effortless Embedding Generation 🧬**: Generate embeddings from various LLMs effortlessly with the new update.
Check out the examples and take Promptify for a spin on GitHub. If you like what you see, we'd be honored if you gave us a star!
**Github**: [https://github.com/promptslab/Promptify](https://github.com/promptslab/Promptify)
Thank you again for your support - here's to more structured AI!
    from promptify import Prompter, OpenAI, Pipeline

    sentence = "The patient is a 93-year-old female with a medical..."

    model = OpenAI(api_key)            # api_key: your OpenAI API key
    prompter = Prompter('ner.jinja')   # pick a prompt template (here: NER)
    pipe = Pipeline(prompter, model)   # combine the template and the model

    result = pipe.fit(sentence, domain="medical", labels=None)

Output:

    [
        {"E": "93-year-old", "T": "Age"},
        {"E": "chronic right hip pain", "T": "Medical Condition"},
        {"E": "osteoporosis", "T": "Medical Condition"},
        {"E": "hypertension", "T": "Medical Condition"},
        {"E": "depression", "T": "Medical Condition"},
        {"E": "chronic atrial fibrillation", "T": "Medical Condition"},
        {"E": "severe nausea and vomiting", "T": "Symptom"},
        {"E": "urinary tract infection", "T": "Medical Condition"},
        {"Branch": "Internal Medicine", "Group": "Geriatrics"},
    ]
/r/LanguageTechnology
https://redd.it/15dfttb
[D] Simple Questions Thread
Please post your questions here instead of creating a new thread. Encourage others who create new posts for questions to post here instead!
Thread will stay alive until next one so keep posting after the date in the title.
Thanks to everyone for answering questions in the previous thread!
/r/MachineLearning
https://redd.it/1518fj5
How essential are strong math and statistics skills for NLP Engineers?
My initial belief was that math and stats would be extremely vital for this field, but I'm seeing some mixed information online. Ironically, Google Bard also was stating that math and stats are not vital. (Though I can't help but think that this is inaccurate).
Can anyone confirm and give some feedback? What are the needed core skills?
/r/LanguageTechnology
https://redd.it/150sew0