First Telegram Data Science channel. Covering all technical and popular stuff about anything related to Data Science: AI, Big Data, Machine Learning, Statistics, general Math and the applications of the former. To reach editors contact: @malev
Dual PatchNorm
The authors propose a new method, Dual PatchNorm, for Vision Transformers which involves adding two Layer Normalization layers before and after the patch embedding layer. Experiments across three datasets show that this method improves the performance of well-tuned ViT models, and qualitative experiments support this.
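A rough sketch of the idea, not the authors' code: one normalization before the patch projection and one after it. The toy sizes, the random weights and the helper names (`layer_norm`, `linear`, `dual_patchnorm_embed`) are assumptions made up for illustration, and the learned scale and shift that a real LayerNorm carries are left out to keep it short.

```python
import math
import random

def layer_norm(vec, eps=1e-6):
    """Normalize one vector to zero mean and unit variance (no learned scale/shift)."""
    mean = sum(vec) / len(vec)
    var = sum((v - mean) ** 2 for v in vec) / len(vec)
    return [(v - mean) / math.sqrt(var + eps) for v in vec]

def linear(vec, weight, bias):
    """Dense projection; weight is a list of rows with shape (out_dim, in_dim)."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weight, bias)]

def dual_patchnorm_embed(patches, weight, bias):
    """LayerNorm -> linear patch embedding -> LayerNorm, applied to each patch."""
    return [layer_norm(linear(layer_norm(p), weight, bias)) for p in patches]

# Toy sizes, chosen only for illustration.
random.seed(0)
patch_dim, embed_dim, num_patches = 12, 8, 4
patches = [[random.random() for _ in range(patch_dim)] for _ in range(num_patches)]
weight = [[random.gauss(0, 0.02) for _ in range(patch_dim)] for _ in range(embed_dim)]
bias = [0.0] * embed_dim

tokens = dual_patchnorm_embed(patches, weight, bias)
print(len(tokens), len(tokens[0]))  # number of tokens, embedding size
```

Each flattened patch is normalized on its own, projected, and normalized again, so the tokens entering the Transformer blocks have roughly zero mean and unit variance regardless of the raw pixel scale.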
Paper: https://arxiv.org/abs/2302.01327
A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-dual-patch-norm
#deeplearning #cv #transformer
Cut and Learn for Unsupervised Object Detection and Instance Segmentation
CutLER (Cut-and-LEaRn) is a new approach for training unsupervised object detection and segmentation models without using any human labels. It uses a combination of a MaskCut approach to generate object masks and a robust loss function to learn a detector. The model is simple and compatible with different detection architectures and can detect multiple objects. It is a zero-shot detector, meaning it performs well without additional in-domain data and is robust against domain shifts across various types of images. CutLER can also be used as a pretrained model for supervised detection and improves performance on few-shot benchmarks. Results show improved performance over previous work, including being a zero-shot unsupervised detector and surpassing other low-shot detectors with finetuning.
Paper: https://arxiv.org/abs/2301.11320
Code link: https://github.com/facebookresearch/CutLER
A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-cutler
#deeplearning #cv #objectdetection #imagesegmentation
StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis
In this paper, the authors propose StyleGAN-T, a model designed for large-scale text-to-image synthesis. With its large capacity, stable training on diverse datasets, strong text alignment, and controllable variation-text alignment tradeoff, StyleGAN-T outperforms previous GANs and even surpasses distilled diffusion models, the previous frontrunners in fast text-to-image synthesis, in terms of sample quality and speed.
StyleGAN-T achieves a better zero-shot MS COCO FID than current state-of-the-art diffusion models at a resolution of 64×64. At 256×256, StyleGAN-T halves the zero-shot FID previously achieved by a GAN but continues to trail SOTA diffusion models.
Paper: https://arxiv.org/abs/2301.09515
Project link: https://sites.google.com/view/stylegan-t?pli=1
A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-stylegan-t
#deeplearning #cv #gan #styletransfer
Nature has published an article with a #superresolution approach for #CT scans.
https://www.sciencedaily.com/releases/2018/03/180321155324.htm
#arxiv: https://arxiv.org/abs/1704.08841
Graph shows what people really mean when they use vague terminology describing the probability of an event.
Another paper on AutoML: neural nets learning to design neural nets.
A reinforcement learning agent that learns to program new neural network architectures.
Results match or beat LSTMs, but with funky nonlinearities (sine, SELU, etc.) and new connections that result in different activation patterns.
Arxiv: https://arxiv.org/abs/1712.07316
Post: https://einstein.ai/research/domain-specific-language-for-automated-rnn-architecture-search
pix2pix Demo: Neural network generates cityscape based on the input label map.
Video showing the progress of GANs for photo generation. Now you can use neural networks to generate an HD photo of a person who never existed.
https://www.youtube.com/watch?v=XOxxPcy5Gr4
#GAN #youtube
An article about the impossibility of an intelligence explosion. There will be no singularity or significant breakthrough, and humanity will die off because of the sun's explosion.
https://medium.com/@francois.chollet/the-impossibility-of-intelligence-explosion-5be4a9eda6ec
Astonishing results on emotion generation and image altering with StarGAN
#CapsNet #tutorial on YouTube
https://www.youtube.com/watch?v=pPN8d0E3900
#deeplearning
And more posts on #CapsNet and how it works.
Capsule Networks Are Shaking up AI — Here’s How to Use Them: https://hackernoon.com/capsule-networks-are-shaking-up-ai-heres-how-to-use-them-c233a0971952
Understanding Hinton’s Capsule Networks. Part I: Intuition:
https://medium.com/@pechyonkin/understanding-hintons-capsule-networks-part-i-intuition-b4b559d1159b
Understanding Hinton’s Capsule Networks. Part II: How Capsules Work:
https://medium.com/@pechyonkin/understanding-hintons-capsule-networks-part-ii-how-capsules-work-153b6ade9f66
On November 1, Geoff Hinton, one of the top NN researchers, published two papers introducing a new approach to #CV problems: Capsule Networks.
This architecture can recognize a face in a picture by detecting the eyes, nose and mouth, regardless of the position, scale or rotation of those elements.
In other words, this approach lets a neural network be invariant to transformations of the object.
First paper: https://arxiv.org/abs/1710.09829
Second paper: https://openreview.net/forum?id=HJWLfGWRb&noteId=HJWLfGWRb
Article on Wired: https://www.wired.com/story/googles-ai-wizard-unveils-a-new-twist-on-neural-networks/
Explanation on hackernoon: https://hackernoon.com/what-is-a-capsnet-or-capsule-network-2bfbe48769cc
Another post with explanation: https://kndrck.co/posts/capsule_networks_explained/
Google's open-source candy for the whole ML community:
Source-to-Source Debuggable Derivatives
https://opensource.googleblog.com/2017/11/tangent-source-to-source-debuggable.html?m=1
#opensource #nn #python #google
The State of Data Science & Machine Learning 2017 by Kaggle.
Very informative article about age, job titles, most popular languages and everything related to DS / ML.
Not to mention that source data is included.
https://www.kaggle.com/surveys/2017
#kaggle #statistics
An interesting perspective here. What if LLMs are viewed through the lens of Microsoft wanting to take a share of the search market?
Trends in the dollar training cost of machine learning systems - https://epochai.org/blog/trends-in-the-dollar-training-cost-of-machine-learning-systems
The Inference Cost Of Search Disruption – Large Language Model Cost Analysis - https://www.semianalysis.com/p/the-inference-cost-of-search-disruption
The AI Brick Wall – A Practical Limit For Scaling Dense Transformer Models, and How GPT 4 Will Break Past It - https://www.semianalysis.com/p/the-ai-brick-wall-a-practical-limit
Training Compute-Optimal Large Language Models - https://arxiv.org/pdf/2203.15556.pdf
🔥 Dreamix: Video Diffusion Models are General Video Editors
Google's new text-based motion model.
Given a small collection of images showing the same subject, Dreamix can generate new videos with the subject in motion.
From just a few images or a video clip, Google's new model Dreamix generates video from a text description!
In the video, Dreamix turns a monkey into a dancing bear given the prompt "A bear dancing and jumping to upbeat music, moving his whole body".
⭐️ Project: https://dreamix-video-editing.github.io/
✅️ Paper: https://arxiv.org/pdf/2302.01329.pdf
⭐️ Video: https://www.youtube.com/watch?v=xcvnHhfDSGM
ai_machinelearning_big_data
GPT-3 for self-therapy
Just came across an interesting article about using #GPT-3 to analyze past journal entries and summarize therapy sessions to gain new perspectives on personal struggles. Dan Shipper loaded his personal journal into the neural network so he could ask it different questions, including one about his own Myers-Briggs personality type (INTJ for those who wondered).
It's a powerful example of how AI tools can help individuals become more productive, effective, and happy. As we continue to see the integration of #AI in various industries, it's important for modern blue collar workers to learn how to properly work with these tools in order to stay at the peak of efficiency.
Let's embrace the future and learn to use AI to our advantage rather than spread FUD about AI replacing the workforce. It won’t, but it will enable some people to achieve more and be way more productive.
Link: https://every.to/chain-of-thought/can-gpt-3-explain-my-past-and-tell-me-my-future
#aiusecase #toolsnotactors
“Listing Embeddings for Similar Listing Recommendations and Real-time Personalization in Search”
From #Airbnb team
https://medium.com/airbnb-engineering/listing-embeddings-for-similar-listing-recommendations-and-real-time-personalization-in-search-601172f7603e
Unfortunately, discrimination against ML competition participants is becoming more frequent. CrowdANALYTIX recently launched a competition that simply bans several countries from participating, this time including Russia.
Spread the word so that we could make Data Science and ML more open, without obsolete discriminatory rules on competition platforms:
https://www.facebook.com/DataChallenges/photos/a.136318350296824.1073741827.136313013630691/182693245659334/?type=3&theater
High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs.
Now mankind can generate content for social networks without taking photos.
Github: https://github.com/NVIDIA/pix2pixHD
Arxiv: https://arxiv.org/pdf/1711.11585.pdf
AI Index report, demonstrating the hype around AI technologies: https://aiindex.org/2017-report.pdf
#DeepLearning predicts when patients die with Average Precision 0.69 (that’s high).
Andrew Ng announced a new project on Twitter: ML to help prioritize palliative (end-of-life) care. The model is an 18-layer Deep Neural Network that takes the EHR data of a patient as input and outputs the probability of death in the next 3-12 months.
The trained model achieves an AUROC score of 0.93 and an Average Precision score of 0.69 on cross validation.
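For readers who want to see how these two metrics differ, here is a small illustrative sketch. The labels and scores are invented toy data, not the paper's, and the function names are ours; ties between scores are only handled in `auroc`.

```python
def auroc(labels, scores):
    """Chance that a random positive is scored above a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Mean of precision@k, taken at the rank k of each positive example."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits, total = 0, 0.0
    for k, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / k
    return total / hits

# Invented toy data: 3 positives among 8 patients.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.35, 0.3, 0.2, 0.1]

print(round(auroc(labels, scores), 3), round(average_precision(labels, scores), 3))
```

AUROC only asks whether positives are ranked above negatives on average, while Average Precision penalizes false alarms near the top of the ranking, which is why it tends to be the lower and stricter number when positives are rare.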
Site: https://stanfordmlgroup.github.io/projects/improving-palliative-care/
Arxiv: https://arxiv.org/abs/1711.06402
#project #DSinthewild #casestudy
StarGAN — a novel and scalable approach that can perform image-to-image translations for multiple domains using only a single model.
GitHub: https://github.com/yunjey/StarGAN
Arxiv: https://arxiv.org/abs/1711.09020
#deeplearning #gan #cv
Realtime object detection by Google.
https://research.googleblog.com/2017/11/automl-for-large-scale-image.html
YouTube demo: https://www.youtube.com/watch?time_continue=70&v=ERglPgx8wFg
#deeplearning #google #caption #detection
An article about #BigBrother: how Facebook is able to track users' interests based on 3 likes.
Enhancing Transparency and Control When Drawing Data-Driven Inferences About Individuals
http://online.liebertpub.com/doi/full/10.1089/big.2017.0074
Imitation learning for structured prediction in natural language processing
https://sheffieldnlp.github.io/ImitationLearningTutorialEACL2017
#nlp #tutorial
Release of spaCy 2.0, a nice NLP library.
https://www.techleer.com/articles/404-spacy-20-released-natural-language-processing-with-python/
#nlp #python
Great example of feature visualisation
https://distill.pub/2017/feature-visualization/